51 research outputs found

    What Others Say About This Work? Scalable Extraction of Citation Contexts from Research Papers

    This work presents a new, scalable solution to the problem of extracting citation contexts: the textual fragments surrounding citation references. These citation contexts can be used to navigate digital libraries of research papers and help users decide what to read. We have developed a prototype system which can retrieve, on demand, citation contexts from the full text of over 15 million research articles in the Mendeley catalog for a given reference research paper. The evaluation results show that our citation extraction system provides additional functionality over existing tools and runs two orders of magnitude faster, while providing a 9% improvement in F-measure over the current state of the art.
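
To make the notion of a citation context concrete, the following is a minimal sketch, not the system described in the abstract. It assumes numeric bracket citations such as `[3]` and treats the enclosing sentence as the context; the pattern `CITATION` and the function `citation_contexts` are hypothetical names introduced only for illustration.

```python
import re

# Assumption: citations appear as numeric brackets, e.g. [3] or [2, 3].
CITATION = re.compile(r"\[(\d+(?:\s*,\s*\d+)*)\]")


def citation_contexts(text, ref_number):
    """Return the sentences in `text` that cite reference `ref_number`."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    contexts = []
    for sentence in sentences:
        for match in CITATION.finditer(sentence):
            numbers = {int(n) for n in match.group(1).split(",")}
            if ref_number in numbers:
                contexts.append(sentence)
                break  # one hit per sentence is enough
    return contexts


sample = (
    "Prior work used rules [1]. "
    "Neural taggers improved recall [2, 3]. "
    "We follow [3] here."
)
contexts_for_3 = citation_contexts(sample, 3)
```

Here `contexts_for_3` holds the second and third sentences of `sample`. A real extractor would also have to handle author-year citation styles and resolve each marker to a bibliography entry, which this sketch does not attempt.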

    Deep Sequential Models for Task Satisfaction Prediction

    Detecting and understanding implicit signals of user satisfaction are essential for experimentation aimed at predicting searcher satisfaction. As retrieval systems have advanced, search tasks have steadily emerged as accurate units not only for capturing searchers' goals but also for understanding how well a system is able to help the user achieve those goals. However, a major portion of existing work on modeling searcher satisfaction has focused on query level satisfaction. The few existing approaches for task satisfaction prediction have narrowly focused on simple tasks aimed at solving atomic information needs. In this work we go beyond such atomic tasks and consider the problem of predicting a user's satisfaction when engaged in complex search tasks composed of many different queries and subtasks. We begin by considering a holistic view of user interactions with the search engine result page (SERP) and extract detailed interaction sequences of their activity. We then look at query level abstraction and propose a novel deep sequential architecture which leverages the extracted interaction sequences to predict query level satisfaction. Further, we enrich this model with auxiliary features which have been traditionally used for satisfaction prediction and propose a unified multi-view model which combines the benefit of user interaction sequences with auxiliary features. Finally, we go beyond query level abstraction and consider query sequences issued by the user in order to complete a complex task, to make task level satisfaction predictions. We propose a number of functional composition techniques which take into account query level satisfaction estimates along with the query sequence to predict task level satisfaction. Through rigorous experiments, we demonstrate that the proposed deep sequential models significantly outperform established baselines at both query and task satisfaction prediction. Our findings have implications for metric development for gauging user satisfaction and for designing systems which help users accomplish complex search tasks.

    Protocol for the development of guidance for stakeholder engagement in health and healthcare guideline development and implementation

    Stakeholder engagement has become widely accepted as a necessary component of guideline development and implementation. While frameworks for developing guidelines express the need for those potentially affected by guideline recommendations to be involved in their development, there is a lack of consensus on how this should be done in practice. Further, there is a lack of guidance on how to equitably and meaningfully engage multiple stakeholders. We aim to develop guidance for the meaningful and equitable engagement of multiple stakeholders in guideline development and implementation. METHODS: This will be a multi-stage project. The first stage is to conduct a series of four systematic reviews. These will (1) describe existing guidance and methods for stakeholder engagement in guideline development and implementation, (2) characterize barriers and facilitators to stakeholder engagement in guideline development and implementation, (3) explore the impact of stakeholder engagement on guideline development and implementation, and (4) identify issues related to conflicts of interest when engaging multiple stakeholders in guideline development and implementation. DISCUSSION: We will collaborate with our multiple and diverse stakeholders to develop guidance for multi-stakeholder engagement in guideline development and implementation. We will use the results of the systematic reviews to develop a candidate list of draft guidance recommendations and will seek broad feedback on the draft guidance via an online survey of guideline developers and external stakeholders. An invited group of representatives from all stakeholder groups will discuss the results of the survey at a consensus meeting which will inform the development of the final guidance papers. Our overall goal is to improve the development of guidelines through meaningful and equitable multi-stakeholder engagement, and subsequently to improve health outcomes and reduce inequities in health

    The CHEMDNER corpus of chemicals and drugs and its annotation principles

    The automatic extraction of chemical information from text requires the recognition of chemical entity mentions as one of its key steps. When developing supervised named entity recognition (NER) systems, the availability of a large, manually annotated text corpus is desirable. Furthermore, large corpora permit the robust evaluation and comparison of different approaches that detect chemicals in documents. We present the CHEMDNER corpus, a collection of 10,000 PubMed abstracts that contain a total of 84,355 chemical entity mentions labeled manually by expert chemistry literature curators, following annotation guidelines specifically defined for this task. The abstracts of the CHEMDNER corpus were selected to be representative of all major chemical disciplines. Each of the chemical entity mentions was manually labeled according to its structure-associated chemical entity mention (SACEM) class: abbreviation, family, formula, identifier, multiple, systematic and trivial. The difficulty and consistency of tagging chemicals in text were measured using an agreement study between annotators, obtaining a percentage agreement of 91%. For a subset of the CHEMDNER corpus (the test set of 3,000 abstracts) we provide not only the Gold Standard manual annotations, but also mentions automatically detected by the 26 teams that participated in the BioCreative IV CHEMDNER chemical mention recognition task. In addition, we release the CHEMDNER silver standard corpus of automatically extracted mentions from 17,000 randomly selected PubMed abstracts. A version of the CHEMDNER corpus in the BioC format has been generated as well. We propose a standard for required minimum information about entity annotations for the construction of domain specific corpora on chemical and drug entities. The CHEMDNER corpus and annotation guidelines are available at: http://www.biocreative.org/resources/biocreative-iv/chemdner-corpus

    Ventilation Techniques and Risk for Transmission of Coronavirus Disease, Including COVID-19: A Living Systematic Review of Multiple Streams of Evidence

    Background: Mechanical ventilation is used to treat respiratory failure in coronavirus disease 2019 (COVID-19). Purpose: To review multiple streams of evidence regarding the benefits and harms of ventilation techniques for coronavirus infections, including that causing COVID-19. (PROSPERO registration: CRD42020178187) Data Sources: 21 standard, World Health Organization–specific and COVID-19–specific databases, without language restrictions, until 1 May 2020. Study Selection: Studies of any design and language comparing different oxygenation approaches in patients with coronavirus infections, including severe acute respiratory syndrome (SARS) or Middle East respiratory syndrome (MERS), or with hypoxemic respiratory failure. Animal, mechanistic, laboratory, and preclinical evidence was gathered regarding aerosol dispersion of coronavirus. Studies evaluating risk for virus transmission to health care workers from aerosol-generating procedures (AGPs) were included. Data Extraction: Independent and duplicate screening, data abstraction, and risk of bias assessment (GRADE for certainty of evidence and AMSTAR 2 for included systematic reviews). Data Synthesis: 123 studies were eligible (45 on COVID-19, 70 on SARS, 8 on MERS), but only 5 studies (1 on COVID-19, 3 on SARS, 1 on MERS) adjusted for important confounders. A study in hospitalized patients with COVID-19 reported slightly higher mortality with noninvasive ventilation (NIV) than with invasive mechanical ventilation (IMV), but 2 opposing studies, 1 in patients with MERS and 1 in patients with SARS, suggest a reduction in mortality with NIV (very low-certainty evidence). Two studies in patients with SARS report a reduction in mortality with NIV compared with no mechanical ventilation (low-certainty evidence). Two systematic reviews suggest a large reduction in mortality with NIV compared with conventional oxygen therapy. Other included studies suggest increased odds of transmission from AGPs. Limitation: Direct studies in COVID-19 are limited and poorly reported. Conclusion: Indirect and low-certainty evidence suggests that use of NIV, similar to IMV, probably reduces mortality but may increase the risk for transmission of COVID-19 to health care workers.

    Same data, different conclusions: Radical dispersion in empirical results when independent analysts operationalize and test the same hypothesis

    In this crowdsourced initiative, independent analysts used the same dataset to test two hypotheses regarding the effects of scientists’ gender and professional status on verbosity during group meetings. Not only the analytic approach but also the operationalizations of key variables were left unconstrained and up to individual analysts. For instance, analysts could choose to operationalize status as job title, institutional ranking, citation counts, or some combination. To maximize transparency regarding the process by which analytic choices are made, the analysts used a platform we developed called DataExplained to justify both preferred and rejected analytic paths in real time. Analyses lacking sufficient detail, reproducible code, or with statistical errors were excluded, resulting in 29 analyses in the final sample. Researchers reported radically different analyses and dispersed empirical outcomes, in a number of cases obtaining significant effects in opposite directions for the same research question. A Boba multiverse analysis demonstrates that decisions about how to operationalize variables explain variability in outcomes above and beyond statistical choices (e.g., covariates). Subjective researcher decisions play a critical role in driving the reported empirical results, underscoring the need for open data, systematic robustness checks, and transparency regarding both analytic paths taken and not taken. Implications for organizations and leaders, whose decision making relies in part on scientific findings, consulting reports, and internal analyses by data scientists, are discussed

    Price of knowledge prompts reflection
